Spamming in Linked Data

Authors

  • Ali Hasnain
  • Mustafa Al-Bakri
  • Luca Costabello
  • Zijie Cong
  • Ian Davis
  • Tom Heath

Abstract

The rapidly growing commercial interest in Linked Data raises the prospect of “Linked Data spam”, which we define as “deliberately misleading information (data and links) published as Linked Data, with the goal of creating financial gain for the publisher”. Compared to conventional technologies affected by spamming, e.g. email and blogs, spammers targeting Linked Data may not be able to push information directly towards consumers, but rather may seek to exploit a lack of human involvement in automated data integration processes performed by applications consuming Linked Data. This paper aims to lay a foundation for future work addressing the issue of Linked Data spam, by providing the following contributions: i) a formal definition of spamming in Linked Data; ii) a classification of potential spamming techniques; iii) a sample dataset demonstrating these techniques, for use in evaluating anti-spamming mechanisms; iv) preliminary recommendations for anti-spamming strategies.

Similar resources

Privacy Concerns of FOAF-Based Linked Data

In this position paper, we introduce a potential problem that arises with the emergence of publicly-available, FOAF-based linked data. The problem allows a spammer to send context-aware spam, which has a high clickthrough rate. Unlike online profiles within social networks, FOAF-based structured data provides a more reliable and accessible “food” for spammers and attackers. Current solutions (e...

ICTNET at Web Track 2010 Spam Task

Web spamming refers to web pages that deceive search engines in order to obtain a higher rank in their search results. Working on the TrecWeb09 data set, we use a content-based spam classifier to check the two ends of each hyperlink; if either end page is content spam, or both are of low quality, the hyperlink is discarded. After all hyperlinks have been checked, PageRank value s...

Environmental Factors Impacting Spam: An Initial Study

Spam is a source of serious concern for both e-mail users and Internet Service Providers (ISPs). While prior research has focused on spam content and spam filtering techniques, this study focuses on country-level, macro-environmental conditions that facilitate spamming activity. Adopting a criminological perspective, this study draws upon the deviance-based theories of rational choice and routin...

Associated Pagerank: A Content Relevance Weighted Pagerank Algorithm

The Pagerank algorithm is a link analysis approach for evaluating the importance of web pages, and in recent years many techniques have been proposed to make the traditional Pagerank algorithm more robust against the biases introduced by link spamming. A key challenge for link analysis is to identify the relevance between the original page and the linked page. The importance scores of web pages should rely on the quality ...

Analysis of Spamming Threats and Some Possible Solutions for Online Social Networking Sites (OSNS)

In this paper we present some spamming techniques, their behaviour, and possible solutions. We analyze how spammers enter online social networking sites (OSNSs) to target them, and the diverse techniques they use for this purpose. Spamming is a very common issue in the present era of the Internet, especially through online social networking sites (such as Facebook, Twitter, and Google+)....

Publication date: 2012